On the Problem of 'aboutness' in Document Analysis

Author

  • W. John Hutchins
Abstract

One of the most crucial problem areas of information science concerns the identification of what documents are 'about'. This paper seeks to define the notion of 'aboutness' within the context of recent work in text linguistics. It describes, first, the essential communicational structures of sentences, paragraphs and texts in terms of theme, rheme and thematic progression, connectors of clauses and sentences, and semantic progression. It then identifies the basic features of the global structures of narrative and expository texts, describes the interaction of macro- and microstructure in the interpretation of texts and the role of presupposed 'states of knowledge' in both text production and text comprehension. Finally, it is argued that for the purposes of information systems the 'aboutness' of documents is to be found among the presuppositions of authors concerning the knowledge of their potential readers.

Similar resources

An Implementation of Symbolic Aboutness Theory

Today information can be shared globally via the Internet and accessed from anywhere in the world. The increasing complexity and size of the WWW create a need for more effective information processing techniques such as information retrieval and filtering, information summarization, topic segmentation, data mining and information discovery, etc. All of them can be fundamentall...

How Nonmonotonic Is Aboutness?

The notion of aboutness is fundamental to information retrieval. Assume there is a document d which is about query q. Now, if information is added to d yielding ~ d, the question arises whether document ~ d is about q? In other words, is aboutness monotonic with respect to information composition? This article shows that aboutness does have nonmonotonic character with respect to composition.

Salience-Based Content Characterisation Of Text Documents

Summarisation is poised to become a generally accepted solution to the larger problem of content analysis. We offer an alternative perspective on this problem, by tackling the complementary task of content characterisation; our motivation for doing so is to avoid some of the fundamental shortcomings of summarisation technologies today. Traditionally, the document summarisation task has been tac...

Deciding Term Aboutness Probabilistically

Information retrieval is the quest to find those information objects relevant to a given information need. Relevance is a difficult notion to define operationally. As a consequence information retrieval mechanisms are typically driven by the decision of when one information carrier (e.g. a document) is about another (e.g. a query). As documents and queries are typically complex representations built...

Publication date: 1977